Tsinghua University and Tencent Hunyuan Win the MLSys 2026 MoE Inference Challenge with a 4.1x Speedup on NPUs
The Tsinghua University Storage Lab and the Tencent Hunyuan AI Infra team took first place worldwide in the MLSys 2026 MoE Model Inference Optimization Challenge. To address the inference bottlenecks of trillion-parameter mixture-of-experts (MoE) models on heterogeneous NPUs, the joint team designed an end-to-end optimization scheme that combines the E-Shard strategy, batched reads of three-dimensional PSUM tensors, and a GEMV path, which together delivered a 4.1x inference speedup.